METHODS AND SYSTEM FOR ENCODING AN AUDIO SEQUENCE
WITH SYNCHRONIZED DATA AND OUTPUTTING THE SAME



FIELD OF THE INVENTION

The present invention relates to audio sequences, and, more
particularly, to the encoding of an audio sequence with synchronized data,
and the output of such an encoded file.

BACKGROUND OF THE INVENTION

Karaoke is a musical performance method in which a person (i.e., the
singer) performs a musical number by singing along with a pre-recorded song
through the reading of that particular song's lyrics, which are preferably
displayed on a display device, such as, for example, a television screen
situated within view of the singer. The singer's voice overrides the voice of
the original singer of the pre-recorded song. A video motion picture, often
referred to as a music video, may also typically be displayed as an
accompaniment to both the music and the singer. Devices providing this
opportunity are known as karaoke musical reproduction devices, and will be
referred to as karaoke devices.

Current karaoke devices use tapes, compact disks (CDs), digital
videodisks (DVDs), computer disks, video compact disks (VCDs) or any other
type of electronic medium to record and play both the music and the lyrics.
With the rise in popularity of karaoke as an entertainment means, more and
more songs are put in karaoke format. As a result, the need to transport and
store these ever-growing musical libraries has become paramount. In some
instances, digitized data representing the music and the lyrics has been
compressed using standard digital compression techniques. For example,
one popular current digital compression technique employs the standard
compression algorithm known as Musical Instrument Digital Interface (MIDI).
U.S. Patent No. 5,648,628 discloses a device that combines music and lyrics
for the purpose of karaoke. The device in the '628 Patent uses the standard
MIDI format with a changeable cartridge which stores the MIDI files.

The International Organization for Standardization (ISO/IEC) has
produced a number of generally known compression standards for the coding
of motion pictures and associated audio data. This standard is referred to as
the MPEG (Motion Picture Experts Group) standard. The MPEG standard is
defined in documents ISO/IEC 11172 (which defines the MPEG 1 standard)
and ISO/IEC 13818 (which defines the MPEG 2 standard), both of which are
incorporated herein by reference. Another popular, non-standard
compression algorithm, which is based on the MPEG 1 and MPEG 2
standards, is referred to as MPEG 2.5. These three MPEG versions (MPEG
1, MPEG 2, MPEG 2.5) will be collectively referred to as "MPEG 1/2." U.S.
Patent No. 5,856,973 discloses a method for communicating private
application data along with audio and video data from a source point to a
destination point using the MPEG 2 format.

MPEG 1/2 is further broken into a number of "layers." In general, the
higher an MPEG 1/2 layer is labeled, the more complexity is involved. MPEG
1/2 Layer III (MP3) is an emerging popular compression format, which may be
used for encoding audio data in an effort to produce near-CD quality results.

MP3 players are portable devices, typically containing a "flash"
memory, a liquid crystal display (LCD) screen, a control panel and an output
jack for audio headphones and other similar devices. Musical compositions
are loaded into the "flash" memory of the MP3 player through connection to a
personal computer (PC) or other similar device, and played for personal
enjoyment.

The MP3 standard defines an "audio sequence," which is broken down
into variable size "frames," which are further broken down into "fields."
Although the syntax of each frame is described in the MP3 standard, the
content of the fields within each frame is not defined and is the subject of the
present invention.

Typical karaoke devices are large, complex, expensive systems used in
bars and nightclubs. They involve large display screens, high fidelity sound
systems and a multitude of storage media, such as, for example, CDs.
Typical MP3 players are small and affordable, but are designed to simply play
music. They have small display screens to display only the title and play time
of a song, limited audio output to a headphone, and minimal (if any)
microphone.

Typical MP3 players do not currently possess the ability to synchronize
a data field, containing lyrical information of a song, with an audio signal,
containing the musical aspect of the song, into a single audio sequence file
that can be stored, manipulated, transported and/or played via a karaoke
player device.

Accordingly, it would be desirable to have a program and method that
overcomes the above disadvantages.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating the syntax of the MP3 audio 
sequence, as described in the MP3 specification standard; 

FIG. 2 is a schematic diagram of an MP3 encoder, as described in the
MP3 specification standard;

FIG. 3 is a schematic diagram, illustrating a modified MP3 encoder, in
accordance with the present invention, to embed karaoke data with an audio
signal to form an MP3 audio sequence;

FIG. 4 illustrates a flow chart of the encoding process, in accordance
with the present invention;

FIG. 5 is a schematic diagram of an MP3 decoder, as described in the
MP3 specification standard;

FIG. 6 is a schematic diagram, illustrating a modified MP3 decoder,
made in accordance with the present invention, to un-embed karaoke data
and an audio signal from an MP3 audio sequence;

FIG. 7 illustrates a flow chart of the decoding process, in accordance
with the present invention; and

FIG. 8 illustrates a block diagram showing the MP3 karaoke player
apparatus.

Corresponding reference characters indicate corresponding parts
throughout the several views. The exemplifications set out herein illustrate
one preferred embodiment of the invention, in one form, and such
exemplifications are not to be construed as limiting the scope of the invention
in any manner.

DETAILED DESCRIPTION OF THE
PRESENTLY PREFERRED EMBODIMENTS

In the present invention, a preferred embodiment for encoding an
audio sequence with synchronized data takes place according to the MP3
standard, as described above. Alternatively, the encoding process described
below may be performed according to the confines of other similar standards.
These other standards may include, for example, MPEG 1/2 Layer 1/2, AC-3,
Microsoft's Windows Media Audio file (WMA), Advanced Audio Coding
(AAC), Lucent Technology's Perceptual Audio Coder (EPAC), Liquid Audio,
real.com's G2, and other frame-based audio format standards. For purposes
of this invention, MPEG 1/2 Layer 1/2 means MPEG 1, MPEG 2 and MPEG 2.5
Layers 1 and 2 formats. Therefore, the present invention is applicable to any
frame-based audio format.

As mentioned above, the MP3 standard defines an "audio sequence."
A typical audio sequence of the MP3 standard is illustrated in FIG. 1. The
audio sequence 10 (shown in more detail in FIG. 1-A) is broken into
variable size "frames" 12. An example of one frame of the audio sequence is
shown in FIG. 1-B.

Each frame is then further broken down into a plurality of fields 14 and
sub-fields 16. Examples of some of the fields 14 and sub-fields 16 of the
frame 12 shown in FIG. 1-B are illustrated in FIGS. 1-C, 1-D and 1-E. In the
preferred embodiment, each frame 12 of the audio sequence 10 includes a
fixed format made up of a header field, an error check field, a main data field
and an ancillary data field. Furthermore, each of the fields 14 is broken
down further into sub-fields 16, an example of which is shown within the
divisions of FIGS. 1-C, 1-D and 1-E. Although the syntax of each frame 12 is
described in the MP3 standard, the content of both the fields 14 and the
sub-fields 16 within each frame 12 is not defined within the MP3 standard. In
addition, the private bits defined in both the header and the audio data
frames, as well as the ancillary data frame, can be used to encode lyrical data
and control signals, or cues to lyrical data and control signals, within the audio
sequence 10, such that it is synchronized with the audio signal upon the
formation of the audio sequence 10.

It is important to note that the header fields for each frame 12 occur
within a fixed period and are a specific size. The data fields associated with
each frame 12, however, are of variable size and do not occur within a fixed
period.

More particularly, the present invention concerns using the private bit
in the header field (FIG. 1-E), the private bits in the main data field
(FIG. 1-C, Field 2) and the ancillary data field (FIG. 1-D) to be embedded with
lyrical text, video, cues to lyrical text or video, and/or control information. This
information will be collectively referred to as karaoke data. It should be noted
that each frame may or may not include any karaoke data.
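
By way of a non-limiting illustration, the private bit of the header field may be read and written as sketched below. The sketch assumes the 32-bit frame header layout of the MPEG audio standard, in which the private bit immediately follows the padding bit. The function names get_private_bit and set_private_bit are hypothetical and are not taken from the standard or from the embodiments described herein.

```python
# Minimal sketch: access the private bit of a 4-byte MPEG audio frame header.
# Assumed layout (most significant bit first): 11 sync bits, version (2),
# layer (2), protection (1), bitrate index (4), sampling frequency (2),
# padding (1), private (1), channel mode (2), mode extension (2),
# copyright (1), original (1), emphasis (2).

SYNC_MASK = 0xFFE00000   # the 11 sync bits are all set in a valid header
PRIVATE_BIT = 1 << 8     # position of the private bit in the 32-bit word


def _header_word(header: bytes) -> int:
    """Return the header as a 32-bit integer after a basic sanity check."""
    if len(header) < 4:
        raise ValueError("a frame header is 4 bytes long")
    word = int.from_bytes(header[:4], "big")
    if word & SYNC_MASK != SYNC_MASK:
        raise ValueError("sync pattern not found")
    return word


def get_private_bit(header: bytes) -> int:
    """Read the private bit (0 or 1) from a frame header."""
    return (_header_word(header) >> 8) & 1


def set_private_bit(header: bytes, bit: int) -> bytes:
    """Return a copy of the header with the private bit set or cleared."""
    word = _header_word(header)
    word = (word | PRIVATE_BIT) if bit else (word & ~PRIVATE_BIT)
    return word.to_bytes(4, "big")
```

Because each header carries a single private bit, a karaoke payload of any useful size would be spread over many frames or over the other fields named above.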

If a frame does include karaoke data, such data may be stored within
any or all portions of the available data fields mentioned above. Preferably,
the above-described information will be stored within the data fields in the
following order: first, the private bit in the header field; second, the private bits
in the main data field; and third, the ancillary data field.

FIG. 2 shows a high level block diagram of an MP3 encoder as
described in the MP3 specification. As mentioned above, karaoke data may
be encoded in the private bit of the header field, the private bits in the main
data field, or within the ancillary data. FIG. 3 illustrates a high level block
diagram of a modified MP3 encoder used to encode the karaoke data. The
frame packing stage of the encoder must be enhanced to synchronize
incoming audio data with karaoke data to pack the frames accordingly. This
is done by sending in tags and control information with the karaoke data. The
"complex frame packing" unit uses this information to sequence the karaoke
data with the audio samples appropriately. FIG. 4 illustrates a flow chart
detailing the encoding process of the present invention, with a focus on frame
packing the karaoke data. Additionally, FIG. 5 illustrates a high level block
diagram of an MP3 decoder, as described in the MP3 specification. FIG. 6
illustrates a high level block diagram of a modified version of the MP3
decoder. FIG. 7 describes a flow chart of the decoding process with a focus
on karaoke data unpacking. During the decoding process, the karaoke data
is produced during the frame unpacking stage while the audio data is
produced as a final product of the inverse mapping stage. The karaoke data
is then sequenced with the audio data external to the decoder.

With reference to FIGS. 1-4, a method of encoding an audio sequence
is provided for, as follows. According to the present invention, an encoder
receives both an audio sample and a data sample (step 100). Preferably, the
encoder is a system that is developed to synchronously encode an audio
sample with a data signal, creating an audio sequence. In the preferred
embodiment, the audio sample is a musical composition. Alternatively, the
audio sample may be an oral signal, such as, for example, an audio version
of a text, such as, for example, a book, a newspaper or a foreign language
textbook. In the preferred embodiment, the data sample may be the words to
a musical composition. Alternatively, the data sample may be an oral version
of a text, such as, for example, an audio version of an English language text
or video data, corresponding to, for example, a music video of the song
embodied in the audio sample.

After receiving the audio sample and the data sample, the encoder
then converts the audio sample into an audio signal (not shown). Preferably,
the conversion process assures that the audio signal will be able to be read
and understood according to the preferred format of the audio sequence. For
example, if the format of the audio sequence is MP3, then the audio signal
will preferably be able to be read according to the MP3 format.

In much the same way, the data sample is converted into a data signal
(step 102). Further, the data signal may include a plurality of data segments.
Each of the data segments preferably corresponds to a portion of the data
sample, such that it may be embedded into the resultant audio sequence.
Not all portions of the data signal need be encoded within the data segments.
Rather, each of the data segments may contain a fractional portion of the
data signal corresponding to the data signal.

For example, if the data sample contains the words to a song, the data
signal would include various data segments, each segment corresponding to,
for example, a word or a beat. The purpose of this, which will be described
in more detail below, is to allow the data segment to be embedded into the
audio sequence, both in an order and in a location such that the data signal
corresponds to the audio signal (i.e., in such a manner that the data signal is
synchronized to the audio signal).

The data signal may also include a control signal. Preferably, the
control signal contains information relating to the order of embedding of the
data signal within the audio sequence. For example, the control signal may
dictate that, during the encoding process, one particular word of the lyrics
contained within the data signal may contain three syllables, each syllable
requiring position at a different beat of the song. Such information would be
preferably contained within the control signal.

After converting both the audio signal and the data signal, the audio
sequence is then encoded. The audio sequence consists of the audio signal,
as converted above, embedded with the data signal, also as converted
above, in such a way that the data signal is synchronized with the audio
signal. This synchronization preferably occurs by embedding, into one of the
frames of the audio sequence, one of the data segments.

More particularly, the encoding process occurs preferably in the
following manner. First, the audio signal is mapped into a plurality of audio
segments (step 105). These audio segments, which are similar in nature to
the above-described data segments, preferably correspond to one beat of the
song. After the control signal is encoded and included within the data signal,
each audio segment is packed into one of the frames of the audio sequence
(step 110). Additionally, one of the data segments is packed into the frames
of the audio sequence, such that the data segment corresponds to the audio
segment packed into the frame of the audio sequence.

Preferably, the sequence of encoding is such that the data segments
are embedded into the audio sequence in the private bit in the header field
first (step 115). Upon filling that private bit, any future data segments are
preferably embedded into the private bit in the main data field (step 120). If
both of the private bits are filled, then any remaining data segments would be
embedded into the ancillary data field (step 125).
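
The fill order of steps 115, 120 and 125 may be sketched as follows. This is only an illustration under stated assumptions: the per-frame capacities MAIN_PRIVATE_BITS and ANCILLARY_BITS are hypothetical values chosen for the example, and the names FrameSlots, pack_segment and unpack_segment do not appear in the embodiments described herein.

```python
from dataclasses import dataclass, field
from typing import List

HEADER_PRIVATE_BITS = 1   # one private bit in each frame header
MAIN_PRIVATE_BITS = 3     # assumed capacity of the main data private bits
ANCILLARY_BITS = 16       # assumed ancillary data budget per frame


@dataclass
class FrameSlots:
    """Karaoke bits carried by one frame, grouped by the field that holds them."""
    header: List[int] = field(default_factory=list)
    main: List[int] = field(default_factory=list)
    ancillary: List[int] = field(default_factory=list)


def pack_segment(bits: List[int]) -> FrameSlots:
    """Fill the header private bit first, then the main data private bits,
    then the ancillary data field (steps 115, 120 and 125)."""
    capacity = HEADER_PRIVATE_BITS + MAIN_PRIVATE_BITS + ANCILLARY_BITS
    if len(bits) > capacity:
        raise ValueError("data segment does not fit in a single frame")
    h = HEADER_PRIVATE_BITS
    m = h + MAIN_PRIVATE_BITS
    return FrameSlots(header=bits[:h], main=bits[h:m], ancillary=bits[m:])


def unpack_segment(slots: FrameSlots) -> List[int]:
    """Recover the data segment by reading the fields in the same order."""
    return slots.header + slots.main + slots.ancillary
```

Reading the fields back in the same fixed order is what lets a decoder reassemble the segment without any extra bookkeeping beyond the control information.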

It should be noted that the data signal is embedded into a lower level
of the audio sequence (i.e., the fields and sub-fields), as opposed to a high
level, such as within the frames themselves. In this way, all the embedded
data will be supported by standard MPEG decoders, and no additional
circuitry will be needed to capture the data.

In operation, for example, assuming the musical composition to be the
musical composition "Layla," the audio sample would contain the music to the
composition. The data sample would be the lyrics to the composition. Both
samples are then converted to, for example, MP3 formats. During the
encoding process, the lyrics to the song would be separated in accordance
with the beat or tempo of the music. Thus, the first line of the song ("What
would you do if you get lonely?") would be separated into the first nine beats
of the music, one for each syllable. The data signal and the audio signal
would then be encoded to form the audio sequence in a manner such that the
frame containing the first beat would also contain the first word, and so on.
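
The assignment of syllables to frames in the example above may be sketched as follows. The sketch assumes MPEG 1 Layer III frames of 1152 samples (about 26 ms at 44.1 kHz); the tempo, the start time and the function names frame_index and schedule are hypothetical values and names chosen for illustration only.

```python
from typing import List, Tuple

SAMPLES_PER_FRAME = 1152  # MPEG 1 Layer III; other variants use other sizes


def frame_index(time_s: float, sample_rate: int = 44100) -> int:
    """Index of the frame that contains the given playback time."""
    return int(time_s * sample_rate) // SAMPLES_PER_FRAME


def schedule(syllables: List[str], first_beat_s: float, tempo_bpm: float,
             sample_rate: int = 44100) -> List[Tuple[int, str]]:
    """Pair each syllable with the frame in which its beat falls."""
    beat_s = 60.0 / tempo_bpm  # duration of one beat in seconds
    return [
        (frame_index(first_beat_s + i * beat_s, sample_rate), syllable)
        for i, syllable in enumerate(syllables)
    ]


# Nine syllables of the first line, one per beat (tempo is an assumption).
line = ["What", "would", "you", "do", "if", "you", "get", "lone", "ly"]
plan = schedule(line, first_beat_s=0.0, tempo_bpm=120.0)
```

Each entry of the resulting plan names the frame into which the corresponding data segment would be packed, so that the syllable is displayed when that frame is played.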

In an alternative embodiment, and in lieu of encoding the
audio sequence with the data, the audio sequence may be encoded with a
series of pointer signals. The pointer signals refer to the data signal, which, in
this embodiment, is stored in a separate file. Additionally, the pointer signals
reference the data signal in accordance with the instructions contained within
the control signal, and are synchronized in the same way as the data signal is
synchronized in the preferred embodiment (i.e., the pointer signals would
refer to the data signals in such a way that the audio sequence is
encoded in such a manner that the frame containing the first beat would also
contain a pointer referencing the separate data file).
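
One possible arrangement of the pointer signals is sketched below, assuming the separate data file is a flat byte string accompanied by a table of (offset, length) entries, and that each pointer signal is simply an index into that table. The layout and the names PointerFrame, build_data_file and resolve are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PointerFrame:
    """One frame of the audio sequence in the pointer embodiment."""
    audio: bytes              # compressed audio segment for this frame
    pointer: Optional[int]    # index of a data segment, or None if absent


def build_data_file(segments: List[str]) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Concatenate the data segments into a separate file image and
    return it together with an (offset, length) table."""
    blob = bytearray()
    table: List[Tuple[int, int]] = []
    for segment in segments:
        data = segment.encode("utf-8")
        table.append((len(blob), len(data)))
        blob += data
    return bytes(blob), table


def resolve(pointer: int, blob: bytes, table: List[Tuple[int, int]]) -> str:
    """Follow a pointer signal into the separate data file."""
    offset, length = table[pointer]
    return blob[offset:offset + length].decode("utf-8")


blob, table = build_data_file(["What", "would", "you"])
frames = [
    PointerFrame(audio=b"\x00", pointer=0),     # first beat, first word
    PointerFrame(audio=b"\x00", pointer=None),  # frame without karaoke data
    PointerFrame(audio=b"\x00", pointer=1),     # second beat, second word
]
```

Keeping only a small index in the frame leaves the lyric text free to be replaced, for example by a translation, without re-encoding the audio.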

After the encoding process has taken place, the audio sequence may
be outputted to either a karaoke player, or to any presently known storing
medium for play at a future time (step 130). With reference to FIGS. 1-7, a
method of outputting an audio signal having a synchronized data signal is
provided. The audio sequence, encoded preferably in the manner set forth
above, is provided (step 200). Contained within the audio sequence is a
compressed audio signal. This compressed audio signal corresponds to the
audio signal, described above, which contains the song portion of the musical
composition. Additionally provided is a compressed data signal,
corresponding to the lyrical portion of the musical composition. The
compressed data signal may be located within the audio signal, or within a
separate data file (in which case, the audio sequence may include the pointer
signals), as described above. At this point, the compressed data signal is
currently synchronized with the compressed audio signal. The compressed
data signal is then unpacked and stored in a buffer (steps 205, 210, 215).
The compressed audio signal is also unpacked. Both signals are then
synchronously outputted to an output device, which may be, for example, a
karaoke player system (steps 220, 225). Alternatively, the output device may
be a speaker, a stereo system, a video system or any other similar device.
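
The output method of steps 200 through 225 may be sketched as a simple loop. In this sketch each frame is modeled as a pair of compressed audio bytes and an optional karaoke segment, and decode_audio is a hypothetical stand-in for the reconstruction and inverse mapping stages; neither the frame model nor the function names are taken from the embodiments described herein.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Optional, Tuple

Frame = Tuple[bytes, Optional[str]]  # (compressed audio, karaoke segment)


def decode_audio(compressed: bytes) -> bytes:
    """Placeholder for reconstruction and inverse mapping (assumption:
    a real decoder would return PCM samples here)."""
    return compressed


def play(frames: Iterable[Frame]) -> Iterator[Tuple[bytes, Optional[str]]]:
    """Unpack karaoke data into a buffer, decode the audio, and emit both
    together so that they reach the output device in step."""
    buffer: Deque[str] = deque()
    for compressed, karaoke in frames:
        if karaoke is not None:
            buffer.append(karaoke)          # steps 205, 210, 215
        pcm = decode_audio(compressed)      # audio unpacked after the data
        text = buffer.popleft() if buffer else None
        yield pcm, text                     # steps 220, 225
```

The buffer bridges the gap between the frame unpacking stage, where the karaoke data becomes available, and the later stage at which the audio for the same frame is ready.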

Turning now to a discussion of the apparatus, FIG. 8 shows a block
diagram of an MP3 karaoke player device. Referring to FIG. 8, in conjunction
with FIGS. 1-7, the Interface Port 50 preferably interfaces to an external
storage source, preferably through a docking station or cable. The Interface
Port 50 is used to transfer ".mp3" files from the external source to the karaoke
player device to be stored in the karaoke player device's Flash Memory 52.
The external storage source may be a Personal Computer or other similar
external device.

The Flash Memory 52 is used to store one or more ".mp3" files to be
played by the MP3 karaoke player. This type of memory can be overwritten
with new information, but will "remember" any files that are stored in it until it
is overwritten on purpose.

The Memory Controller 54 is used to coordinate the interface between
the Interface Port 50 and the Flash Memory 52, between the Flash Memory
52 and the MP3 Decoder 56, and between the Flash Memory 52 and the LCD
controller 58. Additionally, the Memory Controller 54 is preferably used to
interface to the person using the karaoke player device through the Button
Controls 60.

The MP3 Decoder 56 provides the function as described above. That
is, it decodes the MP3 karaoke file (i.e., the ".mp3" file), and outputs audio
data to the Audio Mixer 62 and karaoke data to the LCD/Karaoke Control 58.

The LCD/Karaoke Control 58 has several functions. First, it controls
the LCD display to display text and lyrics, highlight words, and scroll lines of
text. The LCD/Karaoke Control 58 also sends video cues received from the
MP3 Decoder 56 to the Video Out Cue Jack 64 for external processing.
Finally, it controls the Audio Mixer 62 to allow the voice of the person using
the device to override the singer's voice in the original song.

The Button Controls 60 allow the person using the device to control
operation of the karaoke player device. Preferably, the button controls 60
include buttons for Play, Forward, Reverse, Pause, Stop, as well as other
basic functions. The button controls 60 allow the user to select a specific
song to play and/or sing along with, skip songs, pause or otherwise
manipulate the songs according to the user's desires.

The Video Out Cue Jack 64 is provided to interface with an external
device controlling the display of a music video. It is also used to send signals
being decoded by the MP3 decoder 56 to this external device to sequence
the music video along with the file being played by the MP3 karaoke player.

The LCD Display 66 provides the visual interface to the person using
the karaoke player device. The LCD display 66 is large enough and flexible
enough to display several rows of text, highlight text, scroll lines of text, etc.
The LCD display 66 also provides karaoke functionality. The display 66 is
preferably flexible enough to display characters in many languages, as the
song playing may be in a different language than the display shows.

The Audio Mixer 62 is used to mix the source audio provided by the
MP3 Decoder 56 with the voice of the person using the device from the
microphone 68. The user's voice overrides the singer's voice in the original
audio. The output of the Audio Mixer 62 is preferably sent to both a
Headphone Jack 70 and an Audio Out Jack 72, preferably through a Digital to
Analog Converter 74.
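
The mixing operation may be sketched as a weighted sum of two sample streams. The sketch assumes 16-bit signed PCM samples and hypothetical gain values that favor the microphone over the decoded track; the embodiments described herein do not specify how the Audio Mixer 62 weights its inputs, and the function name mix is illustrative only.

```python
from typing import List, Sequence

PCM_MIN, PCM_MAX = -32768, 32767  # range of a 16-bit signed sample


def mix(track: Sequence[int], voice: Sequence[int],
        track_gain: float = 0.5, voice_gain: float = 1.0) -> List[int]:
    """Mix decoded audio with microphone samples, clipping to 16 bits.
    The shorter input is treated as silence once it runs out."""
    length = max(len(track), len(voice))
    mixed: List[int] = []
    for i in range(length):
        t = track[i] if i < len(track) else 0
        v = voice[i] if i < len(voice) else 0
        sample = int(round(t * track_gain + v * voice_gain))
        mixed.append(max(PCM_MIN, min(PCM_MAX, sample)))
    return mixed
```

Clipping keeps the sum within the range accepted by a digital to analog converter when both inputs are loud at the same time.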

Finally, the Microphone 68 allows the person using the device to sing
along with the musical composition as it is played, guided by the lyrics
displayed on the LCD Display 66.

It should be appreciated that the embodiments described above are to
be considered in all respects only illustrative and not restrictive. The scope of
the invention is indicated by the following claims rather than by the foregoing
description. All changes that come within the meaning and range of
equivalents are to be embraced within their scope.

WE CLAIM:

1. A method of encoding an audio sequence with synchronized
data, comprising the steps of:

providing an audio sample and a data sample;

converting the audio sample into an audio signal;

converting the data sample into a data signal, the data signal
including a plurality of data segments; and

encoding the audio signal with the data signal to form an audio
sequence, the audio sequence including a plurality of frames, each frame
including at least one field for receiving at least one data segment of the data
signal.

2. The method of Claim 1, wherein the data signal further includes
a control signal; and further comprising the step of:

encoding the audio sequence in accordance with instructions
contained within the control signal.

3. The method of Claim 2, further comprising the step of outputting
the audio sequence.

4. The method of Claim 1, wherein the audio sequence is provided
in a format selected from the group of formats consisting of MPEG 1/2 Layer
1/2, AC-3, WMA, AAC, EPAC, Liquid and G-2 formats.



5. The method of Claim 1, wherein the data sample further
includes text data.

6. The method of Claim 1, wherein the data sample further
includes video data.

7. The method of Claim 1, wherein the audio sample comprises a
song.

8. The method of Claim 1, wherein the audio sample comprises
spoken voice.

9. The method of Claim 1, wherein the encoding process further
comprises the following steps:

mapping the audio signal into a plurality of audio segments;

encoding a control signal, the control signal being included
within the data signal;

packing each audio segment into one of the frames of the audio
sequence; and

packing each data segment into one of the frames of the audio
sequence containing a corresponding audio segment in accordance with
instructions contained within the control signal.

10. A program for encoding an audio sequence with synchronized
data from a data signal, comprising:

computer readable program code that provides an audio sample
and a data sample;

computer readable program code that converts the audio
sample into an audio signal;

computer readable program code that converts the data sample
into a data signal, the data signal including a plurality of data segments; and

computer readable program code that encodes the audio signal
with the data signal into an audio sequence, the audio sequence including a
plurality of frames, each frame including at least one field for receiving at least
one data segment of the data signal.

11. A method of encoding an audio sequence with synchronized
data, comprising the steps of:

providing an audio sample and a data sample;

converting the audio sample into an audio signal;

converting the data sample into a data signal, the data signal
including a plurality of data segments; and

encoding the audio signal with a plurality of pointer signals to
form an audio sequence, each pointer signal referencing at least one data
segment of the data signal.

12. The method of Claim 11, wherein the data signal further
includes a control signal; and further comprising the step of:

encoding the audio sequence in accordance with instructions
contained within the control signal.

13. The method of Claim 12, further comprising the step of
outputting the audio sequence.

14. The method of Claim 11, wherein the audio sequence is
provided in a format selected from the group of formats consisting of MPEG
1/2 Layer 1/2, AC-3, WMA, AAC, EPAC, Liquid and G-2 formats.

15. The method of Claim 11, wherein the data sample further
includes text data.

16. The method of Claim 11, wherein the data sample further
includes video data.

17. The method of Claim 11, wherein the audio sample comprises a
song.

18. The method of Claim 11, wherein the audio sample comprises
spoken voice.

19. The method of Claim 12, wherein the encoding process further
comprises the following steps:

mapping the audio signal into a plurality of audio segments;

encoding a control signal, the control signal being included
within the data signal;

packing each audio segment into one of the frames of the audio
sequence; and

packing into each audio segment one of the pointer signals,
each pointer signal referencing one of the data segments of the data signal.

20. A program for encoding an audio sequence with synchronized
data, comprising:

computer readable program code that provides an audio sample
and a data sample;

computer readable program code that converts the audio
sample into an audio signal;

computer readable program code that converts the data sample
into a data signal, the data signal including a plurality of data segments; and

computer readable program code that encodes the audio signal
with a plurality of pointer signals to form an audio sequence, each pointer
signal referencing at least one data segment of the data signal.

21. A method of outputting an audio signal having a synchronized
data signal, comprising the steps of:

providing an audio sequence with synchronized data, the audio
sequence including a compressed audio signal;

providing a compressed data signal, the compressed data signal
being synchronized with the compressed audio signal;

unpacking the compressed data signal;

storing the data signal in a buffer;

unpacking the compressed audio signal from the audio
sequence; and

outputting the audio signal and the data signal to an output
device.

22. The method of Claim 21, further comprising the step of
unpacking the compressed data signal from the audio sequence.

23. The method of Claim 21, wherein the audio sequence further
includes a plurality of pointer signals, each pointer signal referencing the
compressed data signal.

24. The method of Claim 21, wherein the audio sequence is in MP3
format.

25. The method of Claim 21, wherein the audio signal is a signal
selected from the group consisting of a song and a spoken voice, and
wherein the data signal is a signal selected from the group consisting of text
and a spoken voice.

26. The method of Claim 21, wherein the output device is a device
selected from the group consisting of a speaker, a stereo system, a karaoke
system and a video system.

27. A program for outputting an audio signal having a synchronized
data signal, comprising:

computer readable program code that provides an audio
sequence with synchronized data, the audio sequence including a
compressed audio signal;

computer readable program code that provides a compressed
data signal, the compressed data signal being synchronized with the
compressed audio signal;

computer readable program code that unpacks the compressed
data signal;

computer readable program code that stores the data signal in a
buffer;

computer readable program code that unpacks the compressed
audio signal from the audio sequence; and

computer readable program code that outputs the audio signal
and the data signal to an output device.

28. The program of Claim 27, further comprising:

computer readable program code that unpacks the compressed
data signal from the audio sequence.

29. The method of Claim 27, wherein the audio sequence further
includes a plurality of pointer signals, each pointer signal referencing the
compressed data signal.



WO 01/61684

PCT/US00/31338




SUBSTITUTE SHEET (RULE 26) 






Fig. 2 (Prior Art): ISO/IEC MP3 encoder. Blocks: audio samples; mapping;
psycho-acoustic model; quantizer and coding; frame packing; ancillary data;
encoded bit stream. Reference numerals shown: 105, 110, 130.



Fig. 3: Modified ISO/IEC MP3 encoder. Blocks: audio samples; mapping;
psycho-acoustic model; quantizer and coding; complex frame packing; karaoke
data, tags, and control; encoded bit stream. Reference numerals shown: 102,
105, 110-125, 130.






Fig. 4: Flow chart of the encoding process. Boxes: audio samples presented
to encoder; mapping, quantization, coding, and psycho-acoustic model; karaoke
data, tags, and control presented to frame packing unit; audio data presented
to frame packing unit; tags decoded; wait for corresponding audio data to
arrive; control decoded; pack header, use private bit if set in control; pack
main data, use private bits if set in control; pack ancillary data if set in
control; output encoded data frames. Reference numerals shown: 100, 102, 105,
110, 115, 120, 125, 130.



Fig. 5 (Prior Art). Block labels: encoded bit stream; frame unpacking; reconstruct; inverse mapping; audio; ancillary data. Reference numerals on the sheet: 200, 220. 


Fig. 6, sheet 4/5. Block labels: encoded bit stream; frame unpacking; reconstruct; inverse mapping; audio; karaoke data, tags, and control. Reference numeral on the sheet: 200. 

Fig. 7. Step labels: encoded bit stream presented to frame unpacking; header unpacked; private bit stored in buffer; main data unpacked; private bits stored in buffer; ancillary data unpacked; ancillary data stored in buffer; reconstruct, inverse mapping; all buffer data present; output audio data; output karaoke data. Reference numerals on the sheet: 200, 205, 205-215, 210, 220, 225. 


INTERNATIONAL SEARCH REPORT 

International application No. PCT/US00/31338 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC(7) : G10L 19/00; H04J 1/02 
US CL : 704/201, 236, 270.1, 500; 370/493; 709/236 
According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 
U.S. : 704/201, 270, 270.1, 500; 370/493; 709/236 

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used) 
BRS 

C. DOCUMENTS CONSIDERED TO BE RELEVANT 

Category* 

Citation of document, with indication, where appropriate, of the relevant passages 

GB 2 323 760 A (NEC CORPORATION) 30 September 1998 (30.09.98), entire document. 

US 5,281,985 A (CHAN) 25 January 1994 (25.01.94), column 1, column 4, column 8. 

US 5,732,216 (LOGAN et al.) 24 March 1998 (24.03.98), columns 1-5, columns 14-15, column 19, column 28. 

US 5,777,997 A (KAHN et al.) 07 July 1998 (07.07.98), entire document. 

US 5,856,973 (THOMPSON) 05 January 1999 (05.01.99), entire document. 

CRAVOTTA, NICHOLAS. The Internet-audio Revolution. EDN. 03 February 2000 (03.02.00), Vol. 45, iss. 3, pp. 101-107, especially pages 101-102, page 105. 

□ Further documents are listed in the continuation of Box C. □ See patent family annex. 

Relevant to claim No. 

1, 3, 5, 8, 9, 10, 21, 22, 27, 28 
26 
21, 22, 23, 25, 27, 28, 29 
7, 8, 17, 18 
7, 17 
1, 2, 3, 5, 6, 9-13, 15, 16, 19-24, 27, 28, 29 
4, 7, 8, 14, 17, 18, 26 
4, 14 



* Special categories of cited documents: 

"A" document defining the general state of the art which is not considered to be of particular relevance 

"E" earlier application or patent published on or after the international filing date 

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified) 

"O" document referring to an oral disclosure, use, exhibition or other means 

"P" document published prior to the international filing date but later than the priority date claimed 

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention 

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone 

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art 

"&" document member of the same patent family 



Date of the actual completion of the international search 
27 February 2001 (27.02.2001) 

Name and mailing address of the ISA/US 
Commissioner of Patents and Trademarks 
Box PCT 
Washington, D.C. 20231 
Facsimile No. (703)305-3230 

Date of mailing of the international search report 

Authorized officer 
R. Korzuch 
Telephone No. (703)305-4700 

Form PCT/ISA/210 (second sheet) (July 1998) 



